Crowdsourced Top-k Algorithms: An Experimental Evaluation

Authors

  • Xiaohang Zhang
  • Guoliang Li
  • Jianhua Feng
Abstract

Crowdsourced top-k computation has attracted significant attention recently, thanks to emerging crowdsourcing platforms such as Amazon Mechanical Turk and CrowdFlower. Crowdsourced top-k algorithms ask the crowd to compare objects and infer the top-k objects from the crowdsourced comparison results. The crowd may return incorrect answers, and traditional top-k algorithms cannot tolerate such errors. To address this problem, the database and machine-learning communities have independently studied the crowdsourced top-k problem. The database community proposes heuristic-based solutions, while the machine-learning community proposes learning-based methods (e.g., maximum likelihood estimation). However, these two types of techniques have not been compared systematically under the same experimental framework, so it is difficult for a practitioner to decide which algorithm to adopt. Furthermore, the experimental evaluation in existing studies has several weaknesses: some methods assume that the crowd returns high-quality results, and some algorithms are tested only in simulated experiments. To alleviate these limitations, in this paper we present a comprehensive comparison of crowdsourced top-k algorithms. Using various synthetic and real datasets, we evaluate each algorithm in terms of result quality and efficiency on real crowdsourcing platforms. We reveal the characteristics of different techniques and provide guidelines on selecting appropriate algorithms for various scenarios.
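As a rough illustration of the setting the abstract describes (workers compare objects, some answers are wrong, and the top-k objects are inferred from the answers), here is a minimal Python sketch. It is not one of the algorithms evaluated in the paper: the function names `majority_vote` and `top_k_by_wins`, the vote format, and the choice of simple majority voting followed by win counting are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def majority_vote(votes):
    """Aggregate repeated worker answers per pair by simple majority.

    votes: iterable of (a, b, winner) tuples, where winner is a or b.
    Returns a dict mapping frozenset({a, b}) to the majority winner.
    Pairs whose votes are tied are left out (hypothetical design choice).
    """
    tallies = defaultdict(Counter)
    for a, b, winner in votes:
        if winner not in (a, b):
            raise ValueError(f"winner {winner!r} is not part of pair ({a!r}, {b!r})")
        tallies[frozenset((a, b))][winner] += 1

    decided = {}
    for pair, counts in tallies.items():
        ranked = counts.most_common(2)
        # Accept the leader only if it strictly beats the runner-up.
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            decided[pair] = ranked[0][0]
    return decided


def top_k_by_wins(objects, votes, k):
    """Return k objects with the most majority-decided pairwise wins.

    Ties in win count are broken by the order of `objects`
    (an arbitrary, illustrative rule).
    """
    decided = majority_vote(votes)
    wins = Counter({obj: 0 for obj in objects})
    for winner in decided.values():
        wins[winner] += 1
    position = {obj: i for i, obj in enumerate(objects)}
    return sorted(objects, key=lambda o: (-wins[o], position[o]))[:k]


# Usage: assumed true order a > b > c > d; each pair gets three answers,
# one of which is deliberately wrong to mimic a noisy worker.
objects = ["a", "b", "c", "d"]
votes = []
for better, worse in combinations(objects, 2):
    votes += [(better, worse, better), (better, worse, better), (better, worse, worse)]

print(top_k_by_wins(objects, votes, 2))
```

Because every pair still has a correct two-to-one majority, the single wrong answer per pair does not change the inferred top-2. This sketch compares every pair, so the number of questions grows quadratically with the number of objects.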

Similar resources

A Rating-Ranking Method for Crowdsourced Top-k Computation

Crowdsourced top-k computation aims to utilize the human ability to identify top-k objects from a given set of objects. Most existing studies employ a pairwise-comparison-based method, which first asks workers to compare each pair of objects and then infers the top-k results from the pairwise comparison results. Obviously, comparing every object pair requires a quadratic number of comparisons, and these methods in...

Ensemble-based Top-k Recommender System Considering Incomplete Data

Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering systems, used either to predict whether a user will prefer an item (the prediction problem) or to identify a set of k items that will be of interest to the user (the top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...

Efficient Techniques for Crowdsourced Top-k Lists

We focus on the problem of obtaining top-k lists of items from larger itemsets, using human workers for doing comparisons among items. An example application is short-listing a large set of college applications using advanced students as workers. We describe novel efficient techniques and explore their tolerance to adversarial behavior and the tradeoffs among different measures of performance (...

A Confidence-Aware Top-k Query Processing Toolkit on Crowdsourcing

Ranking techniques have been widely used in ubiquitous applications such as recommendation and information retrieval. For items that are hard to rank computationally but easy for humans to rank, crowdsourcing is considered an emerging technique for producing the ranking with human effort. However, there is a lack of an easy-to-use toolkit for answering crowdsourced top-k queries with minimal effort. In this work, we de...

Anytime Measures for Top-k Algorithms

Top-k queries on large multi-attribute data sets are fundamental operations in information retrieval and ranking applications. In this paper, we initiate research on the anytime behavior of top-k algorithms. In particular, given specific top-k algorithms (TA and TASorted) we are interested in studying their progress toward identification of the correct result at any point during the algorithms’...

Journal:
  • PVLDB

Volume 9  Issue 

Pages  -

Publication date 2016